CKMorph: a comprehensive morphological analyzer for Central Kurdish


Abstract

A morphological analyzer, a significant component of many natural language processing applications, especially for morphologically rich languages, divides an input word into all its composing morphemes and identifies their roles. This paper introduces a comprehensive morphological analyzer for Central Kurdish (CK), also known as Sorani, a low-resourced language with rich morphology. Building upon the limited existing literature, we first assembled and systematically categorized an extensive collection of the morphological and morphophonological rules of the language. Additionally, we collected and manually labeled a generative lexicon containing nearly 10,000 verb, noun, and adjective stems, named entities, and other types of word stems. We used these rule sets and resources to implement the CKMorph Analyzer based on finite-state transducers. In order to provide a benchmark for future research, we collected, manually labeled, and publicly shared test sets for evaluating the accuracy and coverage of the analyzer. CKMorph was able to correctly analyze 95.9% of the first test set, which contains 1000 CK words analyzed according to their context. Moreover, CKMorph gave at least one analysis for 95.5% of the 4.22 M tokens of the second test set. The demonstration application and the resources, including the CK verb database and the test sets, are openly accessible at github.com/CKMorph.
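The two evaluation figures in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the toy English lexicon, the tag strings, and the function names `toy_analyze`, `coverage`, and `accuracy` are hypothetical and are not part of CKMorph. The sketch treats coverage as the share of tokens that receive at least one analysis, and it assumes a word counts as correctly analyzed when the context-appropriate gold analysis appears among the returned candidates; the paper may define correctness differently.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy lexicon standing in for a real analyzer's output.
# Each surface form maps to zero or more candidate analyses.
TOY_ANALYSES: Dict[str, List[str]] = {
    "cats": ["cat+N+PL"],
    "walks": ["walk+V+3SG", "walk+N+PL"],
    "walked": ["walk+V+PAST"],
}

Analyzer = Callable[[str], List[str]]


def toy_analyze(word: str) -> List[str]:
    """Return all candidate analyses for a word (empty list if unknown)."""
    return TOY_ANALYSES.get(word, [])


def coverage(tokens: Sequence[str], analyze: Analyzer) -> float:
    """Fraction of tokens that receive at least one analysis."""
    if not tokens:
        return 0.0
    covered = sum(1 for token in tokens if analyze(token))
    return covered / len(tokens)


def accuracy(gold: Sequence[Tuple[str, str]], analyze: Analyzer) -> float:
    """Fraction of (word, gold analysis) pairs whose gold analysis is
    among the analyzer's candidates (an assumed correctness criterion)."""
    if not gold:
        return 0.0
    correct = sum(1 for word, label in gold if label in analyze(word))
    return correct / len(gold)


if __name__ == "__main__":
    tokens = ["cats", "walks", "dogs", "walked"]
    gold = [
        ("cats", "cat+N+PL"),
        ("walks", "walk+N+PL"),
        ("walked", "walk+V+3SG"),
        ("dogs", "dog+N+PL"),
    ]
    print(f"coverage: {coverage(tokens, toy_analyze):.2f}")
    print(f"accuracy: {accuracy(gold, toy_analyze):.2f}")
```

Coverage needs only raw tokens, which is why it can be measured on millions of tokens, whereas accuracy needs manually labeled gold analyses and is therefore measured on a much smaller set.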


Similar resources

A Comprehensive Morphological Analyzer for Swedish

SWETWOL is implemented in the framework of Koskenniemi’s (1983) two-level model. It contains a 48,000 item lexicon and a full inflectional description. Special attention was paid to the design of a computational analysis of productive Swedish compounds. Recall (coverage) and precision of SWETWOL meet high standards. SWETWOL has been extensively tested on various types of texts.


Morphological Analyzer for Kokborok

Morphological analysis is concerned with retrieving the syntactic and morphological properties or the meaning of a morphologically complex word. Morphological analysis retrieves the grammatical features and properties of an inflected word. However, this paper introduces the design and implementation of a Morphological Analyzer for Kokborok, a resource constrained and less computerized Indian la...


A Morphological Analyzer for Filipino Verbs

This paper presents a morphological analyzer that accepts Filipino verbs conjugated in different forms as inputs and analyzes them to produce the affixes used, the infinitive forms, and the tenses of the original input verbs. A prototype system was implemented and was fed with a file containing 1,050 Filipino verbs conjugated in various tenses using different types of affixes. The preliminary r...


A morphological Analyzer for Standard Albanian

In this paper, we present a morphological analyzer for standard Albanian intended as a component of an annotation tool in the context of the Albanian Corpus Initiative. The analyzer uses off-line components for generating sub-regular and irregular word forms based on the verb inflector described in Trommer (1997) and simple morphological rules for main inflectional patterns. Part of the analyze...


VenPro: A Morphological Analyzer for Venetan

This document reports the process of extending MorphoPro for Venetan, a lesser-used language spoken in the North-Eastern part of Italy. MorphoPro is the morphological component of TextPro, a suite of tools oriented towards a number of NLP tasks. In order to extend this component to Venetan, we developed a declarative representation of the morphological knowledge necessary to analyze and synthesi...



Journal

Journal title: International Journal of Digital Humanities

Year: 2023

ISSN: 2524-7832, 2524-7840

DOI: https://doi.org/10.1007/s42803-022-00062-7